"Delivering Signals for Fun and Profit"

Understanding, exploiting and preventing signal-handling related vulnerabilities.

Michal Zalewski
(C) Copyright 2001 BindView Corporation

0) Introduction
---------------

According to a popular belief, writing signal handlers has little or nothing to do with secure programming, as long as the handler code itself looks good. At the same time, there have been discussions about which functions may be invoked from handlers and which must never be used there. Most Unix systems provide a standardized set of signal-safe library calls. A few systems, including OpenBSD and Solaris, document these calls extensively:

http://www.openbsd.org/cgi-bin/man.cgi?query=sigaction:

"The following functions are either reentrant or not interruptible by signals and are async-signal safe. Therefore applications may invoke them, without restriction, from signal-catching functions:

_exit(2), access(2), alarm(3), cfgetispeed(3), cfgetospeed(3), cfsetispeed(3), cfsetospeed(3), chdir(2), chmod(2), chown(2), close(2), creat(2), dup(2), dup2(2), execle(2), execve(2), fcntl(2), fork(2), fpathconf(2), fstat(2), fsync(2), getegid(2), geteuid(2), getgid(2), getgroups(2), getpgrp(2), getpid(2), getppid(2), getuid(2), kill(2), link(2), lseek(2), mkdir(2), mkfifo(2), open(2), pathconf(2), pause(2), pipe(2), raise(3), read(2), rename(2), rmdir(2), setgid(2), setpgid(2), setsid(2), setuid(2), sigaction(2), sigaddset(3), sigdelset(3), sigemptyset(3), sigfillset(3), sigismember(3), signal(3), sigpending(2), sigprocmask(2), sigsuspend(2), sleep(3), stat(2), sysconf(3), tcdrain(3), tcflow(3), tcflush(3), tcgetattr(3), tcgetpgrp(3), tcsendbreak(3), tcsetattr(3), tcsetpgrp(3), time(3), times(3), umask(2), uname(3), unlink(2), utime(3), wait(2), waitpid(2), write(2). sigpause(3), sigset(3).

All functions not in the above list are considered to be unsafe with respect to signals.
That is to say, the behaviour of such functions when called from a signal handler is undefined. In general though, signal handlers should do little more than set a flag; most other actions are not safe."

The usual advice is to take special care when performing any non-atomic operation while signal delivery is not blocked, and not to rely on internal program state in a signal handler. Generally, a signal handler should do little more than set a flag, whenever that is acceptable.

Unfortunately, until now no practical security consequences of ignoring this advice were known. And while a signal can be delivered _anywhere_ during the userspace execution of a given program, most programmers never take enough care to avoid the implications of this fact. Approximately 80 to 90% of the signal handlers we have examined were written in an insecure manner.

This paper is an attempt to demonstrate and analyze the actual risks caused by this kind of coding practice, and to discuss threat scenarios that an attacker can use to escalate local privileges or, sometimes, to gain remote access to a machine. This class of vulnerabilities affects numerous complex setuid programs (Sendmail, screen, pppd, etc.) and several network daemons (ftpd, httpd and so on).

Thanks to Theo de Raadt for bringing this problem to my attention; to Przemyslaw Frasunek for discussing remote attack possibilities; to Dvorak, Chris Evans and Pekka Savola for their outstanding contributions to the field of heap corruption attacks; and to Gregory Neil Shapiro and Solar Designer for their comments on the issues discussed below. Additional thanks to Mark Loveless, Dave Mann, Matt Power and other RAZOR team members for their support and reviews.

1) Impact: handler re-entry (Sendmail case)
-------------------------------------------

Before we discuss more generalized attack scenarios, I would like to explain signal handler races starting with a very simple and clean example.
We will try to exploit a non-atomic signal handler. The following code generalizes, in a simplified way, a very common bad coding practice (which is present, for example, in the setuid root Sendmail program up to 8.11.3 and 8.12.0.Beta7):

  /*********************************************************
   * This is a generic verbose signal handler - does some  *
   * logging and cleanup, probably calling other routines. *
   *********************************************************/

  void sighndlr(int dummy) {
    syslog(LOG_NOTICE,user_dependent_data);

    // *** Initial cleanup code, calling the following somewhere:
    free(global_ptr2);
    free(global_ptr1);

    // *** 1 *** >> Additional clean-up code - unlink tmp files, etc <<
    exit(0);
  }

  /**************************************************
   * This is a signal handler declaration somewhere *
   * at the beginning of main code.                 *
   **************************************************/

    signal(SIGHUP,sighndlr);
    signal(SIGTERM,sighndlr);

    // *** Other initialization routines, and global pointer
    // *** assignment somewhere in the code (we assume that
    // *** nnn is partially user-dependent, yyy does not have to be):

    global_ptr1=malloc(nnn);
    global_ptr2=malloc(yyy);

    // *** 2 *** >> further processing, allocated memory <<
    // *** 2 *** >> is filled with any data, etc...     <<

This code seems to be pretty immune to any kind of security compromise. But this is just an illusion. If one of the signals handled by sighndlr() is delivered somewhere in the middle of main code execution (marked as '*** 2 ***' in the example above), execution reaches the handler function. Let's assume we delivered SIGHUP. The syslog message is written, the two pointers are freed, and some more clean-up is done before exiting (*** 1 ***).

Now, by quickly delivering another signal - SIGTERM - the attacker might cause re-entry into the sighndlr() function. Note that the signal already being handled is masked and would not be delivered again, so you cannot deliver a second SIGHUP, but there is absolutely nothing to prevent delivery of SIGTERM.
This is a very common condition - 'shared' handlers are declared for SIGQUIT, SIGTERM, SIGINT, and so on.

Now, for the purpose of this demonstration, we would like to target heap structures by exploiting the behavior of free() and syslog(). It is very important to understand how the [v]syslog() implementation works. We will focus on the Linux glibc code - this function creates a temporary copy of the logged message in a so-called memory-buffer stream, which is dynamically allocated using two malloc() calls: the first one allocates the general stream description structure, and the second one creates the actual buffer that will contain the logged message.

Please refer to the following URL for the vsyslog() function sources:

http://src.openresources.com/debian/src/libs/HTML/S/glibc_2.0.7t.orig%20glibc-2.0.7t.orig%20misc%20syslog.c.html#101

Stream management functions (open_memstream, etc.) can be found at:

http://src.openresources.com/debian/src/libs/HTML/S/glibc_2.0.7t.orig%20glibc-2.0.7t.orig%20libio%20memstream.c.html#63

In order for this particular attack to be successful, two conditions have to be met:

  * syslog() data must be user-dependent (as in Sendmail log messages describing transferred mail traffic),

  * the second of the two global memory blocks must be placed so that it is re-used by the second malloc() call made in open_memstream().

The second buffer (global_ptr2) is free()d during the first sighndlr() call, so if these conditions are met, the second syslog() call re-uses this memory and overwrites the area, including heap-management structures, with the user-dependent syslog() buffer.

Of course, this situation is not limited to two global buffers - generally, we need one out of any number of free()d buffers to be placed that way. Additional possibilities are related to interrupting the chain of free() calls by precise SIGTERM delivery and/or influencing buffer sizes and heap data order by using different input data patterns.
If so, the attacker can cause the second free() pass to be called with a pointer to user-dependent data (the syslog buffer), which leads to instant root compromise - see the excellent article by Chris Evans (based on observations by Pekka Savola):

http://lwn.net/2000/1012/a/traceroute.php3

Practical discussion and exploit code for the vulnerability discussed in the above article can be found here:

http://security-archive.merton.ox.ac.uk/bugtraq-200010/0084.html

Below is a sample 'vulnerable program' code:

--- vuln.c ---
#include <signal.h>
#include <syslog.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

void *global1, *global2;
char *what;

/* Non-atomic handler shared by SIGHUP and SIGTERM. */
void sh(int dummy) {
  syslog(LOG_NOTICE,"%s\n",what);
  free(global2);
  free(global1);
  sleep(10);   /* leaves a wide window for the second signal */
  exit(0);
}

int main(int argc,char* argv[]) {
  what=argv[1];
  global1=strdup(argv[2]);
  global2=malloc(340);
  signal(SIGHUP,sh);
  signal(SIGTERM,sh);
  sleep(10);
  exit(0);
}
---- EOF ----

You can exploit it, forcing free() to be called on a memory region filled with 0x41414141 (you can see this value in the registers at the time of the crash -- the bytes represented as 41 in hex are set by the 'A' input characters in the variable $LOG below). Sample command lines for a Bash shell are:

$ gcc vuln.c -o vuln
$ PAD=`perl -e '{print "x"x410}'`
$ LOG=`perl -e '{print "A"x100}'`
$ ./vuln $LOG $PAD & sleep 1; killall -HUP vuln; sleep 1; killall -TERM vuln

The result should be a segmentation fault followed by a nice core dump (for Linux glibc 2.1.9x and 2.0.7).

(gdb) back
#0  chunk_free (ar_ptr=0x4013dce0, p=0x80499a0) at malloc.c:3069
#1  0x4009b334 in __libc_free (mem=0x80499a8) at malloc.c:3043
#2  0x80485b8 in sh ()
#3  <signal handler called>
#4  0x400d5971 in __libc_nanosleep () from /lib/libc.so.6
#5  0x400d5801 in __sleep (seconds=10) at ../sysdeps/unix/sysv/linux/sleep.c:85
#6  0x80485d6 in sh ()

So, as you can see, the failure occurred when the signal handler was re-entered.
The __libc_free function was called with a parameter of 0x080499a8, which points somewhere in the middle of our AAAs:

(gdb) x/s 0x80499a8
0x80499a8: 'A' , "\n"

You can find 0x41414141 in the registers as well, showing that this data is being processed. For more analysis, please refer to the paper mentioned above.

For the description, impact and fix information on the Sendmail signal handling vulnerability, please refer to the RAZOR advisory at:

http://razor.bindview.com/publish/advisories/adv_sm8120.html

Obviously, that is just one example of this attack. Whenever signal handler execution is non-atomic, attacks of this kind are possible by re-entering the handler while it is in the middle of performing non-reentrant operations. Heap damage is the most obvious attack vector in this case, but not the only one.

2) Impact: signal in the middle (screen case)
---------------------------------------------

The attack described above usually requires specific conditions to be met, and takes advantage of non-atomic signal handler execution, which can be easily avoided by using additional flags or by blocking signal delivery.

But, as a signal can be delivered at any moment (unless explicitly blocked), it is obviously possible to perform an attack without re-entering the handler itself. It is enough to deliver a signal at an 'inappropriate' moment. There are two attack schemes:

A) re-entering libc functions:

Every function that is not listed as reentry-safe is a potential source of vulnerabilities. Indeed, numerous library functions operate on global variables and/or modify global state in a non-atomic way. Once again, heap-management routines are probably the best example. If a signal is delivered while malloc(), free() or any other libcall of this kind is executing, all subsequent calls to the heap management routines made from the signal handler have unpredictable effects, because the heap may be in an inconsistent state that the programmer cannot anticipate.
Other good examples are functions working on static/global variables and buffers, like certain implementations of strtok(), inet_ntoa(), gethostbyname() and so on. In all cases, results will be unpredictable.

B) interrupting non-atomic modifications:

This is basically the same problem, but outside library functions. For example, the following code:

  dropped_privileges = 1;
  setuid(getuid());

is, technically speaking, using safe library functions only. But, at the same time, it is possible to interrupt execution between the assignment and the setuid() call, causing the signal handler to be executed with the dropped_privileges flag set, but with superuser privileges not yet dropped. This, very often, might be a source of serious problems.

First of all, we would like to come back to the Sendmail example, to demonstrate potential consequences of re-entering libc. Note that the signal handler is NOT re-entered - the signal is delivered only once:

#0  0x401705bc in chunk_free (ar_ptr=0x40212ce0, p=0x810f900) at malloc.c:3117
#1  0x4016fd12 in chunk_alloc (ar_ptr=0x40212ce0, nb=8200) at malloc.c:2601
#2  0x4016f7e6 in __libc_malloc (bytes=8192) at malloc.c:2703
#3  0x40168a27 in open_memstream (bufloc=0xbfff97bc, sizeloc=0xbfff97c0) at memstream.c:112
#4  0x401cf4fa in vsyslog (pri=6, fmt=0x80a5e03 "%s: %s", ap=0xbfff99ac) at syslog.c:142
#5  0x401cf447 in syslog (pri=6, fmt=0x80a5e03 "%s: %s") at syslog.c:102
#6  0x8055f64 in sm_syslog ()
#7  0x806793c in logsender ()
#8  0x8063902 in dropenvelope ()
#9  0x804e717 in finis ()
#10 0x804e9d8 in intsig ()  <---- ** SIGINT **
#11 <signal handler called>
#12 chunk_alloc (ar_ptr=0x40212ce0, nb=4104) at malloc.c:2968
#13 0x4016f7e6 in __libc_malloc (bytes=4097) at malloc.c:2703

Heap corruption is caused by the interrupted malloc() call and, later, by calling malloc() once again from the vsyslog() function invoked from the handler.
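
Scheme B above can be made concrete with a small sketch. It is not taken from Sendmail or any other program discussed here; every name in it (uid_really_dropped, on_term, drop_privileges_blocked, run_mask_demo) is hypothetical, and a plain variable stands in for the effect of setuid(getuid()) so that the sketch runs without privileges. It assumes a single-threaded process. The two-step update is wrapped in sigprocmask(), and a raise(SIGTERM) placed between the two steps simulates a badly timed signal; because delivery is blocked, the handler only runs after both steps are complete.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical program state, named after the snippet above. */
static volatile sig_atomic_t dropped_privileges = 0;
/* Stand-in for the effect of setuid(getuid()); an assumption for the demo. */
static volatile sig_atomic_t uid_really_dropped = 0;
static volatile sig_atomic_t handler_runs = 0;
static volatile sig_atomic_t saw_inconsistency = 0;

/* Handler that relies on program state, as the handlers in this paper do. */
static void on_term(int signo)
{
    (void)signo;
    handler_runs = handler_runs + 1;
    if (dropped_privileges != uid_really_dropped)
        saw_inconsistency = 1;   /* flag set, privileges not yet dropped */
}

/* Performs the two-step update with all signals blocked. */
int drop_privileges_blocked(void)
{
    sigset_t all, saved;

    sigfillset(&all);
    if (sigprocmask(SIG_BLOCK, &all, &saved) != 0)
        return -1;

    dropped_privileges = 1;
    raise(SIGTERM);              /* simulated attack: stays pending while blocked */
    uid_really_dropped = 1;      /* real code: setuid(getuid()), result checked */

    /* Restoring the old mask delivers the pending SIGTERM here. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Returns 0 if the handler ran exactly once and saw consistent state. */
int run_mask_demo(void)
{
    struct sigaction sa;

    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    if (drop_privileges_blocked() != 0)
        return -1;
    return (handler_runs == 1 && saw_inconsistency == 0) ? 0 : 1;
}
```

Calling run_mask_demo() once from main() is enough to try it. Removing the two sigprocmask() calls makes the handler run between the two assignments, which is exactly the window described in scheme B.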
There are two more examples of very interesting stack corruption caused by re-entering heap management routines in the Sendmail daemon - in both cases, the signal was delivered only once:

A)
#0  0x401705bc in chunk_free (ar_ptr=0xdbdbdbdb, p=0x810b8e8) at malloc.c:3117
#1  0xdbdbdbdb in ?? ()

B)
/.../
#9  0x79f68510 in ?? ()
Cannot access memory at address 0xc483c689

We'd like to leave this one as an exercise for the reader - try to figure out why this happens and why this problem can be exploitable.

For now, we would like to come back to our second scenario, interrupting non-atomic code, to show that targeting the heap is not the only possibility. Some programs temporarily return to superuser UID in cleanup routines, e.g., in order to unlink specific files. Very often, by entering the handler at the right moment, it is possible to perform all the cleanup file access operations with superuser privileges.

Here's an example of such coding, which can be found mainly in interactive setuid software:

--- vuln2.c ---
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

/* Handler reports the credentials it happens to run with. */
void sh(int dummy) {
  printf("Running with uid=%d euid=%d\n",getuid(),geteuid());
}

int main(int argc,char* argv[]) {
  seteuid(getuid());
  setreuid(0,getuid());
  signal(SIGTERM,sh);
  sleep(5);
  // this is a temporarily privileged code:
  seteuid(0);
  unlink("tmpfile");
  sleep(5);
  seteuid(getuid());
  exit(0);
}
---- EOF ----

# gcc vuln2.c -o vuln2; chmod 4755 vuln2
# su user
$ ./vuln2 & sleep 3; killall -TERM vuln2; sleep 3; killall -TERM vuln2
Running with uid=500 euid=500
Running with uid=500 euid=0

Such a coding practice can be found, for example, in the 'screen' utility developed by Oliver Laumann. One of the most obvious locations is the CoreDump handler [screen.c]:

static sigret_t
CoreDump SIGDEFARG
{
  /.../
  setgid(getgid());
  setuid(getuid());
  unlink("core");
  /.../

SIGSEGV can be delivered in the middle of the user-initiated screen detach routine, for example.
To better understand what is going on and why, here is strace output for the detach (Ctrl+A, D) command:

23534 geteuid() = 0
23534 geteuid() = 0
23534 getuid() = 500
23534 setreuid(0, 500) = 0        *** HERE IT HAPPENS ***
23534 getegid() = 500
23534 chmod("/home/lcamtuf/.screen/23534.tty5.nimue", 0600) = 0
23534 utime("/home/lcamtuf/.screen/23534.tty5.nimue", NULL) = 0
23534 geteuid() = 500
23534 getuid() = 0

The marked line sets the real uid to zero. If SIGSEGV is delivered somewhere near this point, the CoreDump() handler runs with superuser privileges, because its initial setuid(getuid()) call now sets the uid to 0.

3) Remote exploitation of signal delivery (WU-FTPD case)
--------------------------------------------------------

This is a very interesting issue, directly related to re-entering libc functions and/or interrupting non-atomic code. Many complex daemons, like ftp, some http/proxy services, MTAs, etc., have SIGURG handlers declared - very often these handlers are pretty verbose, calling syslog() or freeing some resources allocated for a specific connection. The trick is that SIGURG, obviously, can be delivered over the network, using a TCP/IP OOB message. Thus, it is possible to perform attacks at the network layer without any privileges.

Below is a SIGURG handler routine which, with small modifications, is shared by both the BSD ftpd and WU-FTPD daemons:

static VOIDRET myoob FUNCTION((input), int input)
{
  /.../
  if (getline(cp, 7, stdin) == NULL) {
    reply(221, "You could at least say goodbye.");
    dologout(0);
  }
  /.../
}

As you can see, under certain conditions the dologout() function is called. This routine looks like this:

dologout(int status)
{
  /.../
  if (logged_in) {
    delay_signaling(); /* we can't allow any signals while euid==0: kinch */
    (void) seteuid((uid_t) 0);
    wu_logwtmp(ttyline, "", "");
  }
  if (logging)
    syslog(LOG_INFO, "FTP session closed");
  /.../
}

As you can see, the authors took an additional precaution not to allow signal delivery in the "logged_in" case.
Unfortunately, syslog() is a perfect example of a libc function that should NOT be called during signal handling, regardless of whether "logged_in" or any other special condition happens to be in effect. As mentioned before, heap management functions such as malloc() are called within syslog(), and these functions are not atomic. The OOB message might arrive when the heap is in virtually any possible state. Playing with uids / privileges / internal state is an option, as well.

4) Practical considerations: timing
-----------------------------------

In most cases timing is a non-issue for local attacks, as the attacker can control the execution environment (e.g., the load average, the number of local files that the daemon needs to access, etc.) and try a virtually infinite number of times by invoking the same program over and over again, increasing the chance of delivering the signal at a given point. For remote attacks, timing is a major issue, but as long as the attack itself does not cause the service to stop responding, thousands of attempts can be made.

5) Solving signal race problems
-------------------------------

This is a very complex and difficult task. There are at least three aspects of this:

  * Using only reentrant-safe libcalls in signal handlers. This would require major rewrites of numerous programs. Another half-solution is to implement a wrapper around every insecure libcall used, with a special global flag checked to avoid re-entry,

  * Blocking signal delivery during all non-atomic operations and/or constructing signal handlers in a way that does not rely on internal program state (e.g. unconditionally setting a specific flag and nothing else),

  * Blocking signal delivery in signal handlers.

Michal Zalewski
16-17 May, 2001
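
As a minimal sketch of the last two points in section 5, the handler from section 1 could be restructured roughly as follows. This is an illustration under stated assumptions, not the fix applied to Sendmail or any other program mentioned above: the names got_signal, on_signal, install_handlers, cleanup and run_flag_demo are hypothetical, the buffer sizes are arbitrary, and the two raise() calls merely simulate SIGHUP followed quickly by SIGTERM. The handler only stores the signal number in a flag, sigaction() is asked to block every other signal while the handler runs, and the non-reentrant work (free(), and in a real program syslog()) happens later in normal program context.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical globals mirroring the section 1 example. */
static volatile sig_atomic_t got_signal = 0;
static char *global_ptr1, *global_ptr2;
static int cleanup_runs = 0;

/* The handler touches nothing but a sig_atomic_t flag. */
static void on_signal(int signo)
{
    got_signal = signo;
}

/* Installs one shared handler; sa_mask blocks all signals while it runs. */
int install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigfillset(&sa.sa_mask);
    if (sigaction(SIGHUP, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Runs in normal context, so heap routines are safe to call here. */
static void cleanup(void)
{
    free(global_ptr2);
    global_ptr2 = NULL;
    free(global_ptr1);
    global_ptr1 = NULL;
    cleanup_runs++;
}

/* Returns 0 if two signals in a row lead to exactly one cleanup pass. */
int run_flag_demo(void)
{
    if (install_handlers() != 0)
        return -1;

    global_ptr1 = malloc(64);    /* sizes are arbitrary for the demo */
    global_ptr2 = malloc(340);
    if (global_ptr1 == NULL || global_ptr2 == NULL) {
        cleanup();
        return -1;
    }

    raise(SIGHUP);               /* first signal: only sets the flag */
    raise(SIGTERM);              /* second signal: cannot re-enter cleanup */

    if (got_signal != 0)         /* main-loop check, outside any handler */
        cleanup();

    return (cleanup_runs == 1 && global_ptr1 == NULL && global_ptr2 == NULL) ? 0 : 1;
}
```

Calling run_flag_demo() once from main() exercises the sketch. A real program would poll got_signal in its main loop and would still need to consider system calls interrupted with EINTR, which this sketch does not address.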